Add packed serialization option for dataframes #8661
@charlesbluca this looks fine overall, but how do we plan on testing this? The existing serialization tests might suffice from a coverage perspective, but you'd have to inject the environment variable in both its true and false states during the test session somehow.
That's a good point I didn't consider - I imagine we could either
|
Something like this may help: https://docs.pytest.org/en/latest/how-to/monkeypatch.html
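A minimal sketch of what the monkeypatch approach could look like, so that both states of the variable are exercised in one test session. `packed_serialization_enabled` is a hypothetical stand-in for whatever check the PR performs, and the accepted truthy strings are an assumption; only the `CUDF_PACKED_SERIALIZATION` name comes from the PR.

```python
import os

ENV_VAR = "CUDF_PACKED_SERIALIZATION"  # name taken from the PR description


def packed_serialization_enabled():
    """Hypothetical helper: True when the variable holds a truthy string.

    The accepted values ("1", "true", "on") are an assumption for this sketch.
    """
    return os.environ.get(ENV_VAR, "").strip().lower() in {"1", "true", "on"}


def test_packed_serialization_enabled(monkeypatch):
    # pytest's monkeypatch fixture restores the environment after the test.
    monkeypatch.setenv(ENV_VAR, "1")
    assert packed_serialization_enabled()


def test_packed_serialization_disabled(monkeypatch):
    # raising=False avoids a KeyError when the variable was never set.
    monkeypatch.delenv(ENV_VAR, raising=False)
    assert not packed_serialization_enabled()
```

This only works as written if the variable is read at call time. If it were read once at import time, the tests would instead need to reload the module after patching the environment.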
Codecov Report
@@             Coverage Diff              @@
##         branch-21.08    #8661    +/-   ##
================================================
+ Coverage       10.60%   10.62%   +0.02%
================================================
  Files             109      109
  Lines           18280    18292     +12
================================================
+ Hits             1938     1943      +5
- Misses          16342    16349      +7
Converting this to draft as there are currently a lot of issues with pack/unpack serialization that need to be worked out before this option is really useful for its intended purpose (benchmarking).
Moving this to |
Closing this as we haven't really found a place for pack/unpack serialization yet. Will reopen if this changes in the future.
Now that we have a way to configure cuDF (#11193), is this worth revisiting?
Adds the option to enable pack/unpack for dataframe serialization using the environment variable CUDF_PACKED_SERIALIZATION. This is intended to be a temporary solution to aid in benchmarking, which will eventually be replaced by a config module discussed in #5311.
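To illustrate the idea of an environment-variable switch between two serialization paths, here is a rough sketch that does not use cuDF at all. A plain dict of columns stands in for a dataframe, `pickle` stands in for both the packed and the per-column paths, and every function name is hypothetical. Only the variable name `CUDF_PACKED_SERIALIZATION` comes from the description above.

```python
import os
import pickle

ENV_VAR = "CUDF_PACKED_SERIALIZATION"  # name taken from the PR description


def use_packed():
    # Assumption: the PR does not say which values enable the option.
    return os.environ.get(ENV_VAR, "false").strip().lower() in {"1", "true", "on"}


def serialize(table):
    """Return (header, frames) for a dict-of-columns stand-in table.

    The header records which path was taken so deserialize() does not
    depend on the environment still being in the same state.
    """
    if use_packed():
        # Placeholder for the packed path: the whole table in one buffer.
        return {"packed": True}, [pickle.dumps(table)]
    # Placeholder for the default path: one frame per column.
    header = {"packed": False, "columns": list(table)}
    return header, [pickle.dumps(col) for col in table.values()]


def deserialize(header, frames):
    if header["packed"]:
        return pickle.loads(frames[0])
    return {name: pickle.loads(f) for name, f in zip(header["columns"], frames)}
```

For benchmarking, the same workload could then be run twice, once with the variable unset and once with it set to `1`, comparing the time spent in `serialize` and `deserialize`.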